Information Flow Analysis with Chinese Text
Abstract
This article investigates the effectiveness of an information inference mechanism on Chinese text. The mechanism derives implicit associations by computing information flow in a high-dimensional conceptual space, which is approximated by a cognitively motivated lexical semantic space model, the Hyperspace Analogue to Language (HAL). A dictionary-based Chinese word segmentation system was used to segment the text into words. To evaluate the Chinese information flow model, it is applied to query expansion: a set of test queries is expanded automatically via information flow computations, and documents are then retrieved. Standard recall-precision measures are used to assess performance. Experimental results for the TREC-5 Chinese queries and the People's Daily corpus suggest that the Chinese information flow model significantly increases average precision, though the increase is not as large as that achieved on English corpora. Nevertheless, there is justification to believe that the HAL-based information flow model, and in turn our psychologistic stance on the next generation of information processing systems, have a promising degree of language independence.
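The two preprocessing steps named in the abstract, dictionary-based segmentation and construction of a HAL space, can be illustrated with a minimal Python sketch. The abstract does not say which segmentation algorithm was used, so forward maximum matching is an assumption here, and the names `segment`, `build_hal`, `lexicon`, `max_len` and `window` are hypothetical. The HAL part follows the commonly described scheme in which a window slides over the token stream and closer preceding words receive larger weights. The information flow computation itself is not sketched.

```python
from collections import defaultdict


def segment(text, lexicon, max_len=4):
    """Forward maximum matching (an assumed algorithm, not confirmed by the source).

    At each position, take the longest lexicon entry that matches;
    fall back to a single character when nothing matches.
    """
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def build_hal(tokens, window=5):
    """Build a direction-sensitive HAL co-occurrence table.

    hal[w][c] accumulates a weight each time c occurs before w within
    `window` tokens; the weight is window - distance + 1, so adjacent
    words contribute the most.
    """
    hal = defaultdict(lambda: defaultdict(float))
    for pos, word in enumerate(tokens):
        for dist in range(1, window + 1):
            if pos - dist < 0:
                break
            hal[word][tokens[pos - dist]] += window - dist + 1
    return hal


if __name__ == "__main__":
    toy_lexicon = {"信息", "检索", "系统"}  # toy lexicon, for illustration only
    words = segment("信息检索系统", toy_lexicon)
    space = build_hal(words, window=2)
    for word in words:
        print(word, dict(space[word]))
```

Each row of the resulting table can be read as a word's vector in the semantic space. The window size and the lexicon are tuning choices that the abstract does not specify.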
Similar resources
Preferred Argument Structure in Chinese: A Comparison Among Conversations, Narratives and Written Texts
The purpose of this study is to investigate the relationship between information flow and preferred argument structure across different text types. A number of studies in both ergative and accusative languages confirm Du Bois' (1987) grammatical constraints. Chinese is neither an ergative nor an accusative language. The results of my Chinese data do not truly confirm Du Bois' constraints. Transitiv...
A MMSM-based Hybrid Method for Chinese MicroBlog Word Segmentation
After years of research, Chinese word segmentation has achieved quite high precision for formal-style text. However, segmentation performance is less satisfactory for MicroBlog corpora. In this paper we describe a scheme for Chinese word segmentation for MicroBlog, which integrates the character-based and word-based information in the directed graph generated by the MMSM model. Word-level ...
Preferred Argument Structure For Discourse Understanding
The main purpose of communication is to exchange information. Any discourse understanding model should be able to process the flow of information throughout the entire text. According to Du Bois's (1987) studies of information flow in discourse across a number of languages, information distribution among argument positions in clauses is by no means random, but certain grammatical patterns tend to...
An intertextual study of the role of
The influence of the Chinese pictorial tradition on Iranian art after the rule of the Mongol patriarchs has become an undeniable theory among researchers. The existence of dragons, ivy and ivy forms has been attributed to the Chinese, but this article adopts a different approach. Based on the belief that this world is the world of texts, it intends to read a sub-text of the large text...
Developing Chinese TAK for Computer Directly
With the development of text analysis, the quality of the knowledge used by the computer is more and more crucial to analysis accuracy, and text analysis knowledge (TAK) has also been developed by many researchers. But so far, except for the lexicon, TAK for computers (such as phrase structure grammar, unregistered word recognition rules, etc.) has been built only on a small scale. Although large scale corpus with wo...
Publication date: 2004